21 research outputs found

    On the Role of Social Identity and Cohesion in Characterizing Online Social Communities

    Two prevailing theories for explaining social group or community structure are cohesion and identity. The social cohesion approach posits that social groups arise out of an aggregation of individuals who have mutual interpersonal attraction as they share common characteristics. These characteristics can range from common interests to kinship ties and from social values to ethnic backgrounds. In contrast, the social identity approach posits that an individual is likely to join a group based on an intrinsic self-evaluation at a cognitive or perceptual level. In other words, group members typically share an awareness of a common category membership. In this work we seek to understand the role of these two contrasting theories in explaining the behavior and stability of social communities on Twitter. A specific focal point of our work is to understand the role of these theories in disparate contexts ranging from disaster response to socio-political activism. We extract social identity and social cohesion features of interest from large-scale datasets of five real-world events and examine the effectiveness of such features in capturing behavioral characteristics and the stability of groups. We also propose a novel measure of social group sustainability based on the divergence in group discussion. Our main findings are: 1) sharing of social identities (especially physical location) among group members has a positive impact on group sustainability; 2) structural cohesion (represented by high group density and low average shortest path length) is a strong indicator of group sustainability; and 3) event characteristics play a role in shaping group sustainability, as social groups in transient events behave differently from groups in events that last longer.
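
The two structural cohesion indicators named in this abstract, group density and average shortest path length, can be illustrated with a minimal sketch. It assumes an undirected, unweighted group graph stored as an adjacency dictionary; the function names and the toy groups are hypothetical and are not taken from the paper.

```python
from collections import deque


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    # Each undirected edge appears in two adjacency sets.
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def average_shortest_path_length(adj):
    """Mean hop distance over ordered pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


# Toy groups (assumed data): a fully connected group and a chain.
clique = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
}
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

for name, group in (("clique", clique), ("chain", chain)):
    print(name, density(group), average_shortest_path_length(group))
```

In this toy case the fully connected group has higher density and a shorter average path than the chain, which is the pattern the abstract associates with more sustainable groups.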

    Simultaneous Detection of Communities and Roles from Large Networks

    Community detection and structural role detection are two distinct but closely related perspectives in network analytics. In this paper, we propose RC-Joint, a novel algorithm to simultaneously identify community and structural role assignments in a network. Rather than being agnostic to one assignment while inferring the other, RC-Joint employs a principled approach to guide the detection process in a nonparametric fashion and ensures that the two sets of assignments are sufficiently different from each other. Roles and communities generated by RC-Joint are both soft assignments, reflecting the fact that many real-world networks have overlapping community structures and role memberships. By comparing with state-of-the-art methods in community detection and structural role detection, we demonstrate that RC-Joint harvests the best of both worlds and outperforms existing approaches, while still being competitive in efficiency. We also investigate the effect of different initialization schemes, and find that using the results of RC-Joint on a sparse network as the seed often leads to faster convergence and higher quality.

    Efficient Community Detection in Large Networks using Content and Links

    In this paper we discuss a very simple approach of combining content and link information in graph structures for the purpose of community discovery, a fundamental task in network analysis. Our approach hinges on the basic intuition that many networks contain noise in the link structure and that content information can help strengthen the community signal. This enables one to eliminate the impact of noise (false positives and false negatives), which is particularly prevalent in online social networks and Web-scale information networks. Specifically, we introduce a measure of signal strength between two nodes in the network by fusing their link strength with content similarity. Link strength is estimated based on whether the link is likely (with high probability) to reside within a community. Content similarity is estimated through cosine similarity or the Jaccard coefficient. We discuss a simple mechanism for fusing content and link similarity. We then present a biased edge sampling procedure which retains edges that are locally relevant for each graph node. The resulting backbone graph can be clustered using standard community discovery algorithms such as Metis and Markov clustering. Through extensive experiments on multiple real-world datasets (Flickr, Wikipedia and CiteSeer) with varying sizes and characteristics, we demonstrate the effectiveness and efficiency of our methods over state-of-the-art learning and mining approaches, several of which also attempt to combine link and content analysis for the purposes of community discovery. Specifically, we always find a qualitative benefit when combining content with link analysis. Additionally, our biased graph sampling approach realizes a quantitative benefit in that it is typically several orders of magnitude faster than competing approaches.
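
The pipeline described in this abstract (fuse link strength with content similarity, then keep only locally relevant edges per node) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: link strength is approximated here by the Jaccard overlap of closed neighborhoods, the fusion is a plain weighted average with a hypothetical weight `alpha`, and "locally relevant" is taken to mean the top `k` scored edges per node. The abstract does not specify any of these choices.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def jaccard(s, t):
    """Jaccard coefficient between two sets."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def backbone(adj, content, alpha=0.5, k=1):
    """Keep, for each node, its k highest-scoring incident edges.

    alpha and k are assumed parameters chosen for illustration.
    """
    kept = set()
    for u, nbrs in adj.items():
        scored = []
        for v in nbrs:
            # Assumed link-strength proxy: overlap of closed neighborhoods.
            link = jaccard(adj[u] | {u}, adj[v] | {v})
            text = cosine(content[u], content[v])
            scored.append((alpha * link + (1 - alpha) * text, v))
        # Highest score first; ties broken by node name for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        for _, v in scored[:k]:
            kept.add(tuple(sorted((u, v))))
    return kept


# Toy network (assumed data): two topical pairs joined by a noisy link b-c.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
content = {
    "a": Counter("cats cats dogs".split()),
    "b": Counter("cats dogs".split()),
    "c": Counter("stocks bonds".split()),
    "d": Counter("stocks bonds bonds".split()),
}
print(sorted(backbone(adj, content)))
```

On this toy input the noisy b-c edge scores lower than either within-pair edge, so it is dropped from the backbone, which could then be handed to a standard clustering algorithm.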

    Prediction of Topic Volume on Twitter

    We discuss an approach for predicting microscopic (individual) and macroscopic (collective) user behavioral patterns with respect to specific trending topics on Twitter. Going beyond previous efforts that have analyzed driving factors in whether and when a user will publish topic-relevant tweets, here we seek to predict the strength of content generation, which allows a more accurate understanding of Twitter users' behavior and more effective utilization of the online social network for diffusing information. Unlike traditional approaches, we integrate multiple dimensions into one regression-based prediction framework covering network structure, user interaction, content characteristics and past activity. Experimental results on three large Twitter datasets demonstrate the efficacy of our proposed method. We find in particular that combining features from multiple aspects (especially past activity information and network features) yields the best performance. Furthermore, we observe that leveraging more past information leads to better prediction performance, although the marginal benefit is diminishing.

    On Understanding the Divergence of Online Social Group Discussion

    We study online social group dynamics based on how group members diverge in their online discussions. Previous studies mostly focused on the link structure to characterize social group dynamics, whereas the group behavior of content generation in discussions is not well understood. In particular, we use Jensen-Shannon (JS) divergence to measure the divergence of topics in user-generated content, and how it progresses over time. We study Twitter messages (tweets) in multiple real-world events (natural disasters and social activism) with different times and demographics. We also model structural and user features with guidance from two socio-psychological theories, social cohesion and social identity, to learn their implications for group discussion divergence. Those features show significant correlation with group discussion divergence. By leveraging them we are able to construct a classifier to predict the future increase or decrease in group discussion divergence, which achieves an area under the curve (AUC) of 0.84 and an F-1 score (harmonic mean of precision and recall) of 0.8. Our approach allows the systematic study of collective diverging group behavior independent of group formation design. It can help to prioritize whom to engage with in communities for specific topics of need during disaster response coordination, and for specific concerns and advocacy in brand management.
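
As a minimal illustration of the Jensen-Shannon divergence mentioned in this abstract, the sketch below compares two topic distributions. It assumes each distribution is a list of probabilities over the same ordered set of topics and uses base-2 logarithms so the result lies between 0 and 1; the example distributions are made up, and how the paper derives topic distributions from tweets is not shown here.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # Wherever p_i > 0 the mixture m_i is also > 0, so kl() never divides by 0.
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical topic distributions for one group in two time windows.
window_1 = [0.7, 0.2, 0.1]
window_2 = [0.2, 0.3, 0.5]
print(js_divergence(window_1, window_2))
```

Identical distributions give 0 and distributions with no shared topics give 1, so tracking the value across consecutive time windows gives a bounded measure of how far a group's discussion drifts.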

    Community Discovery: Simple and Scalable Approaches

    The increasing size and complexity of online social networks have brought distinct challenges to the task of community discovery. A community discovery algorithm needs to be efficient, not taking a prohibitive amount of time to finish. The algorithm should also be scalable, capable of handling large networks containing billions of edges or even more. Furthermore, a community discovery algorithm should be effective in that it produces community assignments of high quality. In this chapter, we present a selection of algorithms that follow simple design principles and have proven highly effective and efficient according to extensive empirical evaluations. We start by discussing a generic approach to community discovery that combines multilevel graph contraction with core clustering algorithms. Next we describe the use of network sampling in community discovery, where the goal is to reduce the number of nodes and/or edges while retaining the network’s underlying community structure. Finally, we review research efforts that leverage various parallel and distributed computing paradigms in community discovery, which can facilitate finding communities in tera- and peta-scale networks.

    Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter

    The widespread use of social networking websites in recent years has suggested a need for effective methods to understand the new forms of user engagement, the factors impacting them, and the fundamental reasons for such engagements. We perform exploratory analysis on Twitter to understand the dynamics of user engagement by studying what attracts a user to participate in discussions on a topic. We identify various factors which might affect user engagement, ranging from content properties and network topology to user characteristics on the social network, and use them to predict user joining behavior. As opposed to traditional ways of studying them separately, these factors are organized in our framework, People-Content-Network Analysis (PCNA), mainly designed to enable understanding of human social dynamics on the web. We perform experiments on various Twitter user communities formed around topics from diverse domains, with varied social significance, duration and spread. Our findings suggest that the capabilities of content, user and network features vary greatly, which motivates incorporating all of these factors in user engagement analysis and indicates a strong need to study the dynamics of user engagement using the PCNA framework. Our study also reveals a certain correlation between the type of event behind a discussion topic and the impact of user engagement factors.